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Abstract 


The study of hydrological safety in reservoirs is carried out with the so- 
called Design Flood Hydrograph. The simplest estimation of such a graph 
is based on the bivariate frequency analysis (BFA), by defining its 
maximum flow (Q) and volume (V), associated with a joint design return 
period. The Copula functions (CF) are based on the dependence between 
Q and V, and define the bivariate distribution by means of the previously 


adopted marginal univariate functions. The practical approach adopted 
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uses CF with a single fit parameter and selects the most appropriate, 
based on the dependence shown by the joint record of Q and V, at the 
far-right end of its empirical distribution. Moreover, the practical approach 
compares and ratifies the adopted CF, against several of common use in 
the BFA. The above, using the fitting errors between the empirical and 
theoretical bivariate probabilities. Due to their importance, the search for 
marginal distributions was carried out based on the L quotient diagram, 
adopting the best three and comparing them to the highly versatile Kappa 
and Wakeby functions. The BFA of the 55 annual floods registered in the 
La Cuña hydrometric station of the Hydrological Region No. 12-3 
(Santiago River), Mexico was carried out. Four joínt design return periods 
were defined and the calculation of their AND curves is detailed. Finally, 
several Conclusions are cited which highlight the advantages of the use 
of CF in flood BFAs. 


Keywords: Families of Copula functions, Kendall's tau quotient, 
Spearmanr's rho coefficient, joint empirical probabilities, right-tail 


dependency, joint return periods, critical events. 


Resumen 


El estudio de la seguridad hidrológica en los embalses se realiza con el 
llamado hidrograma de la creciente de diseño. El proceso más sencillo y 
aproximado para estimar tal gráfica se basa en el análisis de frecuencias 
bivariado (AFB) para definir su gasto máximo (Q) y volumen (V), 
asociados con el periodo de retorno conjunto de diseño. Las funciones 
Cópula (FC) se fundamentan en la dependencia entre Q y V, y definen la 
distribución bivariada por medio de las funciones univariadas marginales 


previamente adoptadas. El enfoque práctico adoptado utiliza FC de un 
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solo parámetro de ajuste y selecciona la más adecuada a partir de la 
dependencia que muestra el registro conjunto de Q y V en el extremo 
derecho de su distribución empírica. Además, contrasta y ratifica la FC 
adoptada contra varias de uso común en los AFB. Lo anterior, por medio 
de los errores de ajuste entre las probabilidades bivariadas empíricas y 
teóricas. La búsqueda de las distribuciones marginales se realizó con base 
en el diagrama de cocientes L para adoptar las tres mejores y 
confrontarlas con las funciones Kappa y Wakeby de gran versatilidad. Se 
realizó el AFB de las 55 crecientes anuales registradas en la estación 
hidrométrica La Cuña de la Región Hidrológica No. 12-3 (río Santiago), 
México. Se definen cuatro periodos de retorno conjuntos de diseño y se 
detalla el cálculo de sus curvas de tipo AND. Por último, se citan varias 
conclusiones que destacan las ventajas del uso de las FC en los AFB de 


crecientes. 


Palabras clave: familias de funciones Cópula, cociente tau de Kendall, 
coeficiente rho de Spearman, probabilidades empíricas conjuntas, 
dependencia en el extremo derecho, periodos de retorno conjuntos, 


eventos críticos. 
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Introduction 


Generalities 


In our country, the Floods caused by rising waters are one of the natural 
disasters that cause the highest economic, environmental and social 
damage to human settlements. Therefore, risk reduction through 
hydraulic works and the elaboration of damage mitigation plans, without 
such structures, have become vital for society. The planning and 
hydrological design of protection works such as dikes and retaining walls, 
rectifications, canalizations, bridges and some urban drainage works, are 
dimensioned based on the Design Floods (CD, by its acronym in Spanish). 
The CDs are maximum annual flows of the river, where the hydraulic 
works will be located, associated with low probabilities of being exceeded 
(Aldama, Ramírez, Aparicio, Mejía-Zermeño, 8: Ortega-Gil, 2006; Rao 81 
Hamed, 2000; Meylan, Favre, € Musy, 2012). 


The most reliable CD estimation is carried out through the univariate 
Flood Frequency Analysis (AFC, by its acronym in Spanish), which 
attempts to represent the maximum annual flow record of a river, with a 
probability distribution function (PDF) or probabilistic model, from where 
the sought predictions or CD are made. AFCs lead to reliable results when 
several conditions are met, among them, that the available record of flows 
is random, that the PDF is the most representative of such data and that 
its fit is made with an efficient and robust method (Hosking 8 Wallis, 
1997; Meylan et a/., 2012; Stedinger, 2017). 
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On the other hand, reservoirs or storages can be classified in a simple 
way, in large, medium and small. The large reservoirs have multiple 
purposes and multi-year operation, and for protection against floods 
among other uses. On the other hand, small reservoirs have a single 
purpose, generally irrigation, and therefore they store water from the 
rainy season for use in the following dry season. Whether large, medium 
or small, the design of reservoirs for hydrological safety requires the 
hydrograph of the CD estimation (Aldama, 2000; Aldama et al., 2006; 
Gómez, Aparicio, € Patiño, 2010). 


The hydrograph is the graphical representation of the flood evolution, 
with time on the abscissa and flow on the ordinate. Then, the area that it 
defines is the drained volume (V) and its amplitudes in time are the total 
duration (D) and in the ordinates the maximum flow (Q); whose time 
elapsed from the start is called peak time (tp). The hydrograph clearly 
establishes that floods are a multivariate hydrological phenomenon, 
whose elements are correlated (Favre, El Adlouni, Perreault, Thiémonge, 
8 Bobée, 2004; Aldama et al., 2006; Gráler et al., 2013; Chowdhary € 
Singh, 2019). 


Aldama (2000) proved that the reservoirs are not sensitive to the tp 
value, and since Q and V are correlated and V is correlated with D, the 
multivariate study of the floods could be reduced to a bivariate analysis 
of Q and V, to roughly define the CD hydrograph. The use of bivariate AFC 
initiated at the end of the last century with the works of Goel, Seth and 
Chandra (1998), and Yue, Ouarda, Bobée, Legendre and Bruneau (1999), 
who processed Q and V data of the annual floods, the first based on the 


bivariate Normal distribution, like Yue (1999), and the latter applying the 
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so-called logistic model (Yue 8 Wang, 2004), with marginal Gumbel 


distributions. 


The first bivariate AFCs were limited by the scarce availability of 
multivariate PDFs, whose first models used were extensions of univariate 
distributions, for example, the  bivariate Normal, Log-normal, 
Exponential, Pareto and Gamma. Favre et al. (2004) indicate that this 
type of distributions have the following three disadvantages: (1) their 
marginal distributions belong to the same family; (2) the extension 
beyond the bivariate case, they are not simple or do not exist and (3) the 
fitting parameters of such marginals are also used to model the 


dependence between the random variables Q and V. 


The limitations and disadvantages of the existing bivariate 
distributions are easily eliminated by applying the so-called Copula 
Functions (CF), which allow modeling the dependency between the 
variables Q and V, in addition to joíning any type of univariate marginal 
distribution previously adopted, to build the new bivariate distribution 
(Shiau, Wang, € Tsai, 2006; Sraj, Bezak, €: Brilly, 2015; Genest 8 
Chebana, 2017; Zhang €: Singh, 2019). 


The first practical applications of CF were logically bivariate and 
began with the works of Favre et al. (2004), and Salvadori and De Michele 
(2004). Then there are the studies by Zhang and Singh (2006), and 
Genest and Favre (2007). Simultaneously, trivariate applications arrive 
with the analyzes of Grimaldi and Serinaldi (2006), and Zhang and Singh 
(2007). Currently, the applications of CF in the analysis of frequencies of 
the various hydrological data have been consolidated, as shown in the 
texts by Salvadori, De Michele, Kottegoda and Rosso (2007), Zhang and 
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Singh (2019), Chen and Guo (2019), and the chapters in Salvadori and 
De Michele (2013), and Chowdhary and Singh (2019). 


In the present study, with the practical CF approach, an operative 
similarity with the univariate AFC is attempted, when performing the 
bivariate ones. To achieve the above, CFs with a single fit parameter are 
used, which have already been tested in bivariate AFCs and whose 
selection is made based on fit indicators and not through random 
simulation. In addition, at the beginning it is defined if the available 
sample of joint data of Q and V, show dependency in the right tail of their 
empirical distribution and if so, CFs that show such dependency are first 
searched and then tested. 


Objectives 


The present study comprises seven objectives: (1) briefly enunciate the 
operative concept of CFs; (2) expose each family of CFs that have been 
used in the AFCs; (3) describe the fit of the CF, based on the association 
measures; (4) explain the concept of right tail dependency, both empirical 
and theoretical, of each CF exposed; (5) select and test the CFs that are 
applicable in the described numerical example; (6) estimate and contrast 
the univariate and bivariate return periods and (7) conduct a numerical 
application with the 55 maximum flows and volumes of the annual floods 
registered at the hydrometric station La Cuña, of Hydrological Region No. 
12-3 (Santiago River), Mexico. 
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Operative theory 


The Copula Functions 


Concept and definition 


The essential characteristic of the Copula Functions (CF) consists in 
allowing to express a joint distribution of correlated random variables, as 
a function of their marginal distributions. So, a CF links or relates the 
univariate marginal distributions to form a multivariate distribution. A 
basic advantage of CFs for forming multivariate distributions is the fact 
that they separate the dependence effect between the random variables 


from the effects of the marginal distributions. 


Due to the above, the construction of the multivariate distribution is 
reduced to studying the relationship between the dependent variables, 
when the univariate marginal distributions are known. In short, the use 
of CF offers complete freedom to adopt or select the best univariate 
marginal models that represent the data (Shiau et a/., 2006; Kottegoda 
8: Rosso, 2008; Meylan et a/., 2012; Zhang € Singh, 2019; Chen € Guo, 
2019). 


As in this study CF will be applied in the bivariate frequency analysis 
of annual floods, the following definition refers to two correlated X and Y 
random variables, whose joint cumulative probability distribution function 
is Fx y(x,y) with univariate marginal distributions Ad(x) and Fy(y); then the 


Copula Function C exists and is such that: 


: ; , : Tecnología y ciencias del agua, ISSN 2007-2422, 
Open Access bajo la licencia CC BY-NC-SA 4.0 (https: //creativecom- 15(2), 01-56. DOI: 10.24850/j-tyca-15-02-01 
mons.org/licenses/by-nc-sa/4.0/) ! " A 


o) 1) Check for updates 
OPEN ACCESS 


Tecnología y 


iencias¿Agua 


Fx y (Xx, y) = C[Fx(), Fy(y)] (1) 


The Copula function C is unique if the random variables X and Y are 
continuous, and otherwise it is uniquely determined on the product of the 
rank of X by the rank of Y. Furthermore, a Copula is a bivariate distribution 


function with uniform marginals at the interval (0, 1). 


Equation (1) defines the basic concept for the development of Copula 
functions and is known Sklar's Theorem exposed in 1959 (Sklar, 1959; 
Shiau et al., 2006; Salvadori et al., 2007; Kottegoda 8 Rosso, 2008; 
Meylan et al., 2012; Chen 8 Guo, 2019; Zhang € Singh, 2019). 


Copula Families 


The Copula Functions (CF) that have been developed have been classified 
into four classes (Meylan et al., 2012; Chowdhary 8 Singh, 2019): 
Archimedean, extreme value, elliptical and miscellaneous. They are also 
classified as one-parameter or multi-parameter copulas, depending on 
the extent or thoroughness to which the dependency structure between 
the variables Q and V is defined. 


Salvadori et al. (2007) provide a comprehensive and useful summary 
of CFs that have been applied in the field of hydrology. In this regard, the 
following relationship has not included the families of Ali-Mikhail-Haq 
Copulas because it limits the dependence to a range of the Kendall tau 
ratio from -0.1817 to 0.3333 and the Farlie-Gumbel-Morgenstern family 
(FGM) which further limits it, to the range of +0.2222. 
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Archimedean copulas have had broad application due to their simple 
construction, single parameter, wide range, and acceptance of both types 
of dependencies. Making Fx(x) = u, Fy(y) = v and 6 the parameter that 
measures the dependence or association between u and v, we have the 
following two families of Archimedean Copulas, which accept negative and 
positive dependence (Shiau et al., 2006; Genest €: Favre, 2007; Salvadori 
et al., 2007; Zhang € Singh, 2019; Chen 8 Guo, 2019; Chowdhary € 
Singh, 2019). 


1. Clayton. This Copula Function is called Cook-Johnson by Zhang and 
Singh (2006). Sraj et al. (2015) cite it with both names. Its equation and 


variation space of O are (Nelsen, 2006): 
C(u, v) = max ([u”? +w"2-1]*%,0) [-1,00)1 (0) (2) 


for 0 = 0, C(u,v) = M(u,v) = uv, is the product copula that uniquely 
defines independence. The relationship of O to Kendall's tau ratio goes as 


follows: 
Ta= 55 (3) 


2. Frank. Its equation and variation space of O are: 


e79U-1)(e797-1) 
e79-1 


Cu,v) = ¿In [1 +! [SONO (4) 
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for 6 = 1, C(u,v) = M(u,v) = uv, independence is defined. The relationship 


of O with Tn is as follows: 
t, =1+5[D,(0)—1] (5) 


where Di1(0) is the Debye function of order 1, whose expression ¡s: 


D,(0) =35 f7 dt (6) 


1 


The previous equation was estimated with numerical integration, 


ratifying its results with the values tabulated by Stegun (1972). 


Next, a family of Extreme Value Copulas is cited, which accepts only 


positive dependence. 


3. Gumbel-Hougaard. Its equation and variation space of 0 are: 
C(u,v) = exp [E Inu)? + In ya [1, 00) (7) 


for 6 = 1, C(u,v) = MN(u,v) = uv, independence is defined. The relationship 


of 6 to Kendall's tau ratio is as follows: 


n= (8) 


Beso 
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Lastly, two families of Miscellaneous class Copulas are cited. 
4. Plackett accepts negative (6 < 1) and positive (9 > 1) dependence: 


14+(0-1)(u+v) [14+(0-1)Ju+v)]2-4uv0 (0-1) 


Clu,v) = 2(0-1) 2(0-1) 


con0<06<0w0 +1 (9) 


For O = 1, C(u,v) = M(u,v) = uv independence is defined. The 


relationship of 6 to Spearman's rho is as follows: 


_ (0+1)  20In(0) 
n (8-1) (6-1)? (10) 


5. Raftery accepts only positive dependence and O varies between O and 
1: 


C(u,v) = min(u, v) + =- (uv) 0-01 — [max(u, NÓ: (11) 


for 6 = 0, C(u,v) = M(u,v) = uv, independence is defined. The relationship 


of 6 to Kendall's tau ratio and to Spearman's rho are as follows: 


tn = 2 (12) 
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0(4-30) (13) 


Pn = “ay 


Families of Associated Copulas 


The four classes of Copula families that exist and for the bivariate case, 
have three associated families, defined by the following strictly increasing 
or decreasing transformations, according to Theorem 2.4.4 in Nelsen 
(2006). Such families are (Salvadori 8: De Michele, 2004; Michiels € De 
Schepper, 2008; Chowdhary € Singh, 2019): 


C'(u,v)=u-—C(ul-—v) (14) 
C"(u,v)=w=C( uv) (15) 
Cu, v) =u+v-1+C(1-u 1-0) (16) 


The first two transformations change the nature of the agreement 
measures; that is, Kendall's tau and Spearman's rho, from negative to 
positive and vice versa. The third transformation is called the Survival 
Copula, because it is related to the joint distribution of the probabilities of 


exceedance, of the pair of random variables u and v whose Copula is C. 


Dupuis (2007) highlighted the importance of the associated Clayton 


Copula, as it has a significant dependence on its right tail defined by the 
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following Equation (17) and whose formula, according to equations (2) 
and (16), is Equation (18): 


Ay =2740 (17) 


Cu,v) =u+v-1+ [41-09 + (1-91 (18) 


Association measures 


Concordance 


Since the Copula Function (CF) characterizes the dependence between 
random variables, it is necessary to study the association measures, to 
count with a method that allows estimating its parameter 6, In general 
terms, a random variable is concordant with another, when its large 
values are associated with the large values of the other and the small 
values of one with the small values of the other. Some variables with 
direct linear correlation will be concordant, since when one increases the 
other also does. Variables with an inverse linear correlation will be 
discordant, since large values of one correspond to small values of the 
other. 


Then, (X1, y1) and (x2, y2) are said to be concordant if x1 < x2 and y1 
< y2, or X1 > X2 and y1 > y2. They are discordant if x1 < x2 and y1 > y2, or 


x1> x2 and y1 < y2. This implies that the pairs (x1 — X2) (Yi — yY2) > O are 
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concordant (c) and discordant (d) when (xi — X2) (y1-— y2) < O (Salvadori 
et al., 2007; Chowdhary € Singh, 2019). 


A numerical association measure is a statistic that indicates the 
degree of dependence or association of the variables. For comparison 
purposes, such measures range from zero to +1 or -1, indicating perfect 
association positive at +1 or negative at -1. Kendall's tau quotient and 
Spearman's rho coefficient are two nonparametric measures that provide 
information on a special form of association or dependence, known as 


concordance (Salvadori et al., 2007). 


Kendall's tau quotient 


It measures the probability of having matching pairs, so it is the ratio 
from c-d to c+d. Its expression to estimate it with bivariate data is (Zhang 
8: Singh, 2006; Zhang €: Singh, 2019): 


ty 5 pa dd signo[ (x; — xXx) 0; 3 y;)] (19) 


in the above equation, n is the number of observations and the sign [:] is 


+1 if such pairs are concordant and -1 if they are discordant. 


Genest and Favre (2007) present a test for the tau ratio, accepting 
that the null hypothesis Ho indicates that X and Y are independent and 
then the statistic has an approximately Normal distribution with zero 
mean and variance 2(2n+5)/[9n(n-1)]. Then, Ho will be rejected with a 


confidence level a = 5 % if: 
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9n(n—-1) 
zen) nl > Za/2 = 1.96 00 


Spearman's rho coefficient 


It measures the correlation between rank pairs (R;, S;) of the continuous 
random variables X; and Y;¡. Therefore, it is equivalent to the Pearson 
correlation coefficient (rxy). Its expression to estimate it in a bivariate 
record of size n is as follows (Chowdhary 8 Singh, 2019; Zhang € Singh, 
2019): 


= a A pe (21) 


Pn = n(n+1)(n-1) Hi=1 n-1 


Genest and Favre (2007) also present a similar test for the rho 
coefficient, whose distribution is close to Normal with zero mean and 


variance 1/(n-1); therefore, Ho is rejected with a level a=5%, if: 


vn — 1: |pnl > 1.96 (22) 


Estimation of the dependency parameter 


The simplest method to estimate the parameter 0 of the Copula Functions 


is similar to the method of moments and is based on the inversion of the 
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equation that relates 0 to Kendall's tau quotient or Spearman's rho 
coefficient. The other two available techniques are called: (1) maximum 
pseudo-likelihood method and (2) exact maximum likelinood method 
(Meylan et al., 2012; Chowdhary € Singh, 2019; Zhang € Singh, 2019; 
Chen € Guo, 2019). In equations (5), (10), (12) and (13) we proceed by 
trial and error to obtain 0; instead, in formulas (3) and (8) its value is 


cleared. 


Estimation of empirical probabilities 


The bivariate empirical probabilities were estimated based on the 
Gringorten formula (Equation (23)), applied by Yue et al. (1999), Chen 
and Guo (2019), and Zhang and Singh (2019). Such a formula ¡s: 


e i-0.44 (23) 


— n+0.12 


in which, í is the number of each data, when they are ordered 
progressively and n is the total number of them, or amplitude of the 
processed register. The previous expression was applied in the two- 
dimensional plane, with the data ordered progressively; the maximum 
flows (Q) in the rows and the volumes (V) in the columns. The plane 
formed is a square of n by n cells, with n cells on its main diagonal, when 
the order number of the row is equal to that of the column. Then each 


pair of annual data (Q and V) is located in the two-dimensional plane and 
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the box defined by the intersection of the row and column is identified 


with the number / that corresponds to the historical year drawn. 


When the n pairs of data are drawn, look for year 1 and define a 
rectangular or square area of minor values of Q and V, whose count of 
numbered cells within, is NM, or combinations of minor Q and V. Once the 
n values of NM; have been calculated, the Gringorten graphical position 


formula is applied to estimate the joint or bivariate empirical probability: 


F(x, y) =P(Q0<qV<n===2 n 


n+0.12 


Selection of the Copula function 


A simple approach to selecting the Copula Function is based on the fit 
error statistics, by comparing the observed empirical probabilities (w?) 
with the theoretical ones calculated (wf) with the Copula Function being 
tested. The indicators applied are the standard mean error (EME, ), the 
absolute mean error (EMA) and the maximum absolute error (EAM); their 
expressions are (Chowdhary € Singh, 2019; Chen € Guo, 2019): 


- Ea (w we) 


EME = (25) 


EMA = 27, (w? —wO)| (26) 
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EAM = max|(w = wj¿)| (27) 


Tail dependence of the CF 


Generalities 


Another criterion that is applied to select a CF is based on the magnitude 
of the dependency in joint distribution's upper tail, which affects the 
accuracy of the extreme predictions. The upper right tail dependency (A1/) 
is the conditional probability that Y is greater than a certain percentile (s) 
of Fr(y), given that X is greater than that percentile in Fx), as Ss 
approaches unity. The lower left tail dependency(A,) compares that Y is 


less than X, when s approaches zero (Chowdhary € Singh, 2019). 


In relation to the exposed CfFs, those of Frank and Plackett have 
insignificant dependencies in their extreme zones: therefore, A = O and 
Ay = 0. In contrast, the Gumbel-Hougaard Copula has significant 


dependence on the upper tail, equal to: 
Ap =2=218 (28) 
Clayton and Raftery Copulas have it in their lower tail and equal to: 


A, =271/0 (29) 
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m=2 (30) 


As already indicated (Equation (17)), the associated Clayton Copula 
has Au =271/8, Families of unexposed Copulas, with Ay >0, correspond to 
Galambos, Genest-Ghoudi, Húsler-Reiss, Joe and t-Student, the latter 
belonging to the elliptic family (Salvadori et al., 2007; Chowdhary 8 
Singh, 2019; Chen 8 Guo, 2019). 


Dupuis (2007) tested six Copula families and found that their ability 
to estimate extreme events ranges from poor to good, in the following 
order: Clayton, Frank, Normal, t-Student, Gumbel-Hougaard, and 
Associated Clayton (Survival Clayton). Similar conclusions are reached by 
Poulin, Huard, Favre and Pugin (2007), when comparing the same six 
families of Copulas and the one called A12, which has significant right tail 


dependency. 


Estimation of the observed dependency 


To address the estimation of the upper tail dependency (Au) shown by the 
available data, the Empirical Copula must first be defined. As the CF that 
characterizes the dependence between the random variables X and Y; 
then the pair of ranges R; and S; coming from such variables, are the 
statistic that retains the greatest amount of information and its scaling 
with the factor 1/(n+1) generates a series of points in the unit square 
[0,1]?, forming the domain of the Empirical Copula (Chowdhary €: Singh, 
2019), defined as follows: 
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Cp(u, 0) = Ela 1 (757 < 7 SD) (31) 


n+1 7 n+1 


In the above equation, 1(*) denotes a function of the random 
variables u and v, which are a continuously increasing transformation of 
X and Y, relative to the empirical probability integrals F.(X) and FA(YN), 


whose equations are: 


_ Rango(X;) 
o o n4+1 


U; = Fa (xi) V; = ma En (Y) (32) 


+1 


Equation (31) is a function from (0, 1)? to (O, 1) that has leaps of 
magnitude 1/n in each pair of the form of Equation (32), for í/ eX1,2,-*, 
ny. Actually, Equation (31) is not a Copula, but a subcopula that serves 


to approximate Copulas (Nelsen, 2006). 


Poulin et al. (2007) use the estimator proposed by Frahm, Junker 
and Schmidt (2005), which is based on a random sample obtained from 


the Empirical Copula, its expression ¡s: 
1 1 1 1 
yd =2- Zexp f E in] Inn yn Gm (33) 


This estimator accepts that the CF can be approximated by one of 
the extreme values class and has the advantage of not requiring a 


threshold value for its estimation. 
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Ratification of the selected Copula Function 


This is the most important stage in the process of practical application of 
the CF, since it is verified that such a model faithfully reproduces the 
observed joint probabilities (Equation (24)). Yue (2000) indicates that a 
simple and practical way of representing the empirical and theoretical 
joint probabilities consists of taking the first one to the abscissa axis and 
the second to the ordinate axis; logically, in such a graph, each pair of 
data defines a point that coincides with or moves away from the line at 
ASS, 

Yue and Rasmussen (2002) apply the Kolmogorov-Smirnov test with 
a significance level (a) of 5 %, to accept or reject the maximum absolute 
difference (dma) between the joint empirical and theoretical probabilities. 
To evaluate the statistic (D,) of the test, the expression proposed by 
Meylan et al. (2012), for a = 5% this is: 


D, == (34) 


n is the number of data. If the dma is less than D,, the adopted CF is 
ratified. 


Bivariate return periods 


The first bivariate return period of the event (X, Y) is defined under the 


OR condition, which indicates that the limits x or y, or both can be 
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exceeded and then, the classical return period equation or inverse 
probability of exceedance will be (Shiau et a/., 2006; Vogel €: Castellarin, 
2017): 


DoS a O E rr (35) 


T PASAVY>y)  1=FxyGoy)  1-C[FXCDFyO0)] 


where, CIFXOO, FM y)] is the selected CF. 


The second bivariate return period of the event (X, Y) is associated 
with the case in which both limits will be exceeded (X > x, Y > y) or AND 
condition, its equation is (Shiau et a/., 2006; Vogel €: Castellarin, 2017): 


e O ORAR (36) 


P(X>xAY>y) = Fx y (%,y) == 1-Fx()-FyON)Y+C[Fx GO, Fy(0)] 


Aldama (2000) obtains the expression Fxy(x,y) of the bivariate 
probability of exceedance through a simple and logical reasoning of 
probabilities applied in the Cartesian plane. Instead, Yue and Rasmussen 
(2002) use the Cartesian plane to define the bivariate event (X, Y) 


conceptually, which can occur in any of the four quadrants. 


The relationship between bivariate and univariate return periods is 
as follows (Yue 8 Rasmussen, 2002; Shiau et al., 2006; Vogel 8 
Castellarin, 2017): 


Txy < min[Tx, Ty] < max[Tx, Ty] < T'xy (37) 
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being: 
A AP A 
XFX) 1-Fx(x) (38) 
A A A 
Y" RO) 1-0) (39) 


Critical or design events 


Volpi and Fiori (2012) highlight that the plot of the AND-type joint return 
period —shown as Figure 1— presents a severe inconsistency as it 
contains, in a bivariate context, univariate critical thresholds. Due to the 
above, such a graph is considered to be made up of two portions, the two 
designated simple (naive part) and the correct one (proper part). The 
straight parts are the tails or straight lines asymptotes to the curved part. 
The probability of an event occurrence or pair of Q and V, is variable in 
the curved part and decreases along the straight part, although all the 
values define the same joint return period. In short, the pairs of values of 
the asymptote lines have low probabilities of occurrence and therefore 
should not be included in the analysis of the search for critical or severe 
peaks (Q and V). For practical purposes, the extreme points of the curved 
part can be defined according to their empirical distribution or close to the 


beginning of the asymptotic lines. 
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Figure 1. Graphs or the four design joint return periods Ty, ontained 
with the Gumbel-Hougaard CF, from the floods of the hydrological 


station La Cuña, Mexico. 


Wald-Wolfowitz test 


This non-parametric test has been exposed and applied by Bobée and 
Ashkar (1991), Rao and Hamed (2000) and Meylan et a/. (2012) to verify 


independence and stationarity in records of maximum annual flows (X;). 
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Due to the above, it was proposed to apply the test to the records of 


maximum flows and annual volumes, which must be random samples. 


Selection of marginal distributions 


Eight applied probability distributions 


The search approach for the ideal marginal distributions took into account 
the statistical characteristics of the hydrological data to be processed, 
maximum flow and volumes of the annual floods. The later, through the 
quotients L of asymmetry (t3) and kurtosis (t4), which allow defining in 
the L-Moment Ratio Diagram of Hosking and Wallis (1997), the best 
distributions, due to their less proximity to the five curves that are 
displayed on such a graph. These distributions have three fit parameters 
and they are: Generalized Logistic (LOG), General Extreme Values (GVE), 
Log-Normal (LGN), Pearson type III (PE3) and Generalized Pareto (PAG). 
The PE3 curve allows testing the Log-Pearson type III (LP3) distribution. 


In addition, two probabilistic models were applied that have shown 
great versatility and universality to represent series of extreme 
hydrological data, due to the fact that they have four and five fit 
parameters; such distributions are the Kappa (Hosking, 1994; Kjeldsen, 
Ahn, 8 Prosdocimi, 2017; Campos-Aranda, 2023) and the Wakeby 
(Hosking € Wallis, 1997). 


The LP3 distribution was the only one that was applied with the 
method of moments in the logarithmic (WRC, 1977) and real (Bobée, 


1975) domains, selecting the one with the best fit. The remaining seven 
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were adjusted to the records of maximum flows and annual volume of 
floods, through the method of moments L, according to procedures 
exposed by Hosking and Wallis (1997). 


L moments of the sample 


The L-moments are linear combinations of the probability-weighted (b;) 
moments, defined by Greenwood, Landwehr, Matalas, and Wallis (1979); 
therefore, they are robust to scattered sample values. Its calculation 
begins by ordering the available series of maximum flows (x;) and annual 
volumes (y;) of the floods from smallest to largest (x, < x, < --* < x,) and 
estimating the b, with the following equation (Hosking 8: Wallis, 1997; Rao 
8: Hamed, 2000; Asquith, 2011; Stedinger, 2017): 


bp=lyr, EA (40) 


n SÍ=T+1 (n-1)(-2)-(n=r) ** 


In the previous expression, the order number r varies from O to 3 
and n is the number of data of the annual series. It follows that bo is equal 
to the arithmetic mean. The moments L of the sample (/) and their 
respective quotients (t) of similarity with the coefficients of variation, 


asymmetry and kurtosis are: 
l, = bo (41) 


(42) 
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la =6-b,—6-b, +bo (43) 
l, =20-b3-30:b, +12: b, —bo (44) 
= ll (45) 
ta =1l3/L, (46) 
n= (47) 


L-Moments Ratio Diagram 


It has t3 on the abscissa axis and t4 on the ordinate axis. The three- 
parameter fit PDFs are curved lines with the following polynomial-type 
equations (Hosking 8 Wallis, 1997): 


Generalized logistic (LOG): 


EL06 = 0.16667 + 0.83333 : tf (48) 


Generalized Pareto (PAG): 
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¿PAG = 0,20196 : tz + 0.95924 - t2 — 0.20096 : ti + 0.04061 -t% (49) 
Log-normal (LGN): 

ELN = 0,12282 + 0.77518 * t2 + 0.12279 : t4 — 0.13638 : t£ + 0.11368 - t£ (50) 
Pearson type III (PT3): 

EP3 = 0,12240 + 0.30115 - t2 + 0.95812 : ti — 0.57488 : t$ + 0.19383 - £$ (51) 
General extreme value (GVE): 

ESE = 0.10701 + 0.11090 : tz + 0.84838 - £2 — 0.06669 - t3 + SF (52) 


being SF = 0.00567 -t3 — 0.04208 : t3 + 0.03763  t£ 


Using the natural logarithms of the data in equations (40) to (47), 
the logarithmic L ratios are obtained, and Equation (51) can then be used 


to estimate the Log-Pearson type ITI distribution. 


Absolute distance 


One of the recent approaches for selecting the best PDF of a data series 


consists of taking the sample values (t3 and t4) to the ratio diagram L and 
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defining their proximity to any of the curves, to obtain the best 
probabilistic model. To avoid subjectivity in the selection of the PDF, it 
has been proposed to evaluate the Absolute Distance (DA) with the 


following expression (Yue 8 Hashino, 2007): 
DA=1I)= E (53) 


where, t3 and t££?* are the asymmetry and kurtosis ratios L of the 
analyzed series and t,(t5?) is the theoretical value of the kurtosis quotient 
L calculated with each PDF (equations (48) to (52)), for the observed 
value of the asymmetry ratio L. A PDF with the smallest DA value is the 


best for the processed data. 


Fit errors 


The first criteria applied for the selection of the best PDF to some available 
data or series were the so-called fit errors (Kite, 1977; Willmott 8 
Matsuura, 2005; Chai € Draxler, 2014). This criterion will be applied after 
selecting the three best PDFs in the L-moment ratio diagram, according 
to their minimum absolute distance and after applying the Kappa and 
Wakeby distributions. 


Changing in equations (25) and (26), the probabilities observed by 
the ordered data of the analyzed series (x; or y;) and the probabilities 
calculated by the estimated values with the PDF that is tested or 


contrasted, the standard error of fit (EEA) and the mean absolute error 
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(EAM). The values that are estimated (2, or 9,) are searched for the same 
probability of non-exceedance, assigned to the data by Gringorten's 


empirical formula (Equation (23)). 


Data to be processed 


The hydrometric station La Cuña with code 12054 and a basin area of 
19097 km, is located in the Verde River of Hydrological Region No. 12-3 
(Santiago River), Mexico. Gomez et al. (2010) expose the record of 
maximum flow in m3/s and annual volumes in millions of m3 (Mm3), from 
the period from 1947 to 2004, with 55 pairs of data (n), since the years 
1983, 1984 and 1985 do not have volume information. Such data is shown 
in Table 1. 
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Table 1. Maximum flows, volumes and their bivariate order numbers of 


the annual floods of the hydrometric station La Cuña, Mexico (Gómez et 


al., 2010). 

Year | Q (m/s) V (Mm»3) NM; Año Q (m*/s) | V(Mm3) | NM; 
1947 784.0 146.80 36 1975 622.1 249.07 40 
1948 736.8 155.12 37 1976 1374.0 527.96 52 
1949 510.0 111.40 28 1977 439.7 111.77 28 
1950 461.0 94.06 24 1978 280.2 66.23 15 
1951 411.0 111.55 27 1979 267.2 45.80 9 
1952 326.0 70.82 19 1980 287.3 99.60 20 
1953 349.8 144.75 28 1981 280.7 28.70 
1954 130.4 23.22 3 1982 156.5 35.37 4 
1955 690.0 203.31 39 1986 698.2 193.51 38 
1956 266.0 106.76 17 1987 184.7 55.39 7 
1957 199.0 45.92 9 1988 595.2 242.21 38 

[| 1958 | 690.0 | 188.71 | 37 | 1989 | 110.2 | 42.49 | 3 | 
1959 340.6 47.91 12 1990 523.9 248.07 37 
1960 249.6 91.58 14 1991 1636.3 443.30 52 

| 1961 | 350.0 | 130.68 | 28 | 1992 | 1168.0 | 172.49 | 40 | 
1962 317.0 51.27 12 1993 295.0 96.50 20 
1963 732.6 127.90 33 1994 212.8 53.55 10 

[| 1964 | 2651 | 82.75 | 15 | 1995 | 367.4 | 114.61 | 27 | 
1965 743.6 295.34 46 1996 144.6 57,43 5 
1966 463.9 202.90 35 1997 78.4 16.55 1 

| 1967 | 1474.99 | 598.38 | 53 | 1998 | 261.9 | 66.17 | 13 | 
1968 323.0 118.25 25 1999 196.3 41.15 7 
1969 160.4 32.22 4 2000 46.8 18.62 1 

[| 1970 | 7638 | 187.75 | 39 | 2001 | 3138 | 75.78 | 18 | 
1971 578.0 166.61 36 2002 319.6 153.79 25 
1972 191.8 26.39 4 2003 621.1 326.28 40 

[| 1973 | 2440.0 | 920.30 | 55 | 2004 | 824.5 | 384.45 | 50 | 
1974 238.4 66.66 13 Med 340.6 111.40 Si 
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Results and their discussion 


Search of the marginal distributions 


Verification of the randomness 


First, the randomness of the records to be processed was verified, based 
on the Wald-Wolfowitz Test; the statistic led to values of 0.284 and 0.213, 
for the flows and volumes in Table 1. 


Calculated absolutes distances 


Table 2 has concentrated the results of equations (48) to (52), 
highlighting the three best PDFs of each data record in Table 1. 


Table 2. Order numbers of the best PDF for the indicated records of the 
hydrometric station La Cuña, Mexico, according to absolute distance in 
the diagram of L ratios (Hosking 8 Wallis, 1997). 


Annual maximum flows (m*/s) Annual volumes (Mm) 

tz = 0.38713 In tz = 0.01249 t3 = 0.43326 In t3 = 0.01382 

ta = 0.25523 In ta = 0.15953 ta = 0.28466 In ta = 0.12466 
LOG distribution 0.0363 (3) LOG distribution 0.0384 (5) 
GVE distribution 0.0179 (2) GVE distribution 0.0240 (3) 
LGN distribution 0.0139 (1) LGN distribution 0.0128 (2) 
PT3 distribution 0.0680 (6) PT3 distribution 0.0755 (6) 
PAG distribution 0.0440 (5) PAG distribution 0.0320 (4) 
LP3 distribution 0.0371 (4) LP3 distribution 0.0022 (1) 
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Distribution of maximum annual flows 


Table 3 shows the adjustment errors and predictions (m3/s) obtained with 
the three best distributions and the Kappa and Wakeby, applied to the 
record of maximum flow in Table 1. It is observed that the two best 
distributions, the LGN and the GVE, report the lowest and similar fitting 
errors; but their predictions are quite different in the four high return 
periods. On the other hand, the fit errors of the Kappa distribution are 
almost identical to those of the LGN, but with intermediate predictions 


and for this reason, it was adopted. 


Table 3. Fit errors and predictions (m3/s) of the three best PDF, the 
Wakeby and the Kappa in the record of maximum annual flows of the 


hydrometric station La Cuña, Mexico. 


EEA EAM Return periods in years 
o (m*/s) | (m*/s) [50 | 100 | 500 [1000 | 5000 | 10000 | 
LOG(3) | 61.3 | 394 | 1816 | 2393 | 4504 | 5902 | 11036 | 14441 | 
GVE (2) 55.7 35.8 1838 | 2372 | 4160 | 5254 8 926 11 172 
LEN(1) | 558 | 36.5 | 1839 | 2293 | 3592 | 4270 | 6167 | 7139 | 
KAPPA 55.7 36.7 1835 | 2335 | 3920 | 4 844 7 784 9 489 
OWAKEBY | 35.2 | 55.7 | 1844 | 2304 | 3646 | 4367 6469 | 7599 | 
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The location (u1), scale (a1) and shape (ki and hi1) parameters of 
the selected Kappa distribution are: 258.7462, 228.381, -0.2685394 and 
0.2888472, whose expression is: 


1/h, 


F(x) = (1 Bn [1 E Eco) (54) 


a1 


Distribution of annual volumes 


In Table 4, similar to the previous one for the annual volumes of Table 1. 
It is observed that the best distribution, the LP3, leads to the lowest fit 
errors; but its predictions are quite lower than those of the GVE. In 
contrast, the Kappa distribution leads to low fit errors and its predictions 
are intermediate, therefore, it was adopted. In this case, the Wakeby 


distribution ratifies the selection. 


Table 4. Fit errors and predictions (Mm) of the three best PDF, Wakeby 
and Kappa in the record of annual volumes of the station hydrometric La 


Cuña, Mexico. 


EEA EAM Return periods in years 
PP | (Mm?) | (Mm?) 50 | 100 | 500 | 1000 | 5000 | 10000. 
COVES) | 166 | 90 | 662 | 888 | 1702 | 2234 | 4158 | sal | 
IND | 145 | 76 167071863 11439 | 1752 | 2661 | 3142 | 
LP3 (1) 10.5 6.3 694 907 1 574 1952 3101 3 736 
KAPPA 15.5 8.1 663 873 1576 2 007 3 459 4 350 
WAKEBY 15.7 8.3 664 872 1 561 1978 3 358 4 193 


seso 
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The location (u2), scale (a2) and shape (k2 and h2) parameters of 


the adopted Kappa distribution are: 60.39858, 78.31082, -0.3155518 
and 0.4287021, whose expression ¡s: 


1/h, 


F() = (1h [1 20201 (55) 


Selection and ratification of the Copula Function 


The bivariate data processing in Table 1 led to the following three 
indicators of association: rxy = 0.9302, 1, = 0.7199 and Pn = 0.9088. 
Equation (20), with n = 55 and the quoted tau, gives a value of 7.76; 
therefore, Kendall's quotient is significant. Equation (22) generates a 
value of 6.68 and so Spearman's coefficient is also significant. Equation 
(33) provides a value for the observed right tail dependence of 1 “= 
0.782. 


On the other hand, Table 5 shows the statistical indicators of fit that 
were obtained by applying the six exposed CF of Clayton, Frank, Gumbel- 
Hougaard (GH), Plackett, Raftery and associated Clayton. In equations 
(25) to (27), the empirical bivariate probabilities were estimated with 
Equation (24) and the theoretical ones with equations (2), (4), (7), (9), 
(11) and (18). 
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Table 5. Statistical indicators of the Copula Functions fit indicated in the 


annual floods of the station hydrometric La Cuña, Mexico. 


Copula 0 EME EAM |DP |DN | MDP MDN Ay 
Clayton (C.) | 5.1394 | 0.0245 | 0.0189 | 28 | 27 |0.0714 | -0.0424 | 0.000 
Frank 12.386 | 0.0227 |0.0169| 29 | 26 |0.0581 | -0.0561 | 0.000 
GH 3.5697 | 0.0255 |0.0192 | 28 | 27 | 0.0541 | -0.0676 | 0.786 
Plackett 75.040 | 0.0247 [0.0187 | 23 | 32 |0.0495 | -0.0647 | 0.000 
Raftery 0.7940 |0.0250 |0.0193| 27 | 28 |0.0746 | -0.0457 | 0.000 
C. Asociada | 5.1394 | 0.0288 [0.0225 | 29 | 26 | 0.0477 -0.0788 | 0.874 


Meaning of the new acronyms: 
DP, DN: Number of positive and negative differences. 


MDP, MDN: Maximum positive and negative differences. 


As the Q and V values of the annual floods recorded at the 
hydrometric station La Cuña, show a magnitude of 1£F“ of 0.782 for the 
dependence observed in its right tail, there is no difficulty in selecting the 
best CF that of Gumbel-Hougaard, for the data in Table 1, because it 
reports a value for the significant dependence on its right end, practically 


the same as that observed or shown by the data. 


When the A“ value is much lower than the Gumbel-Hougaard CF 
one, it is possible to resort to applying the elliptic class CF called t- 
Student; which, according to the results of Dupuis (2007) and Poulin et 


al. (2007) has an equal dependency on both queues, but of lesser value. 


In Table 5 the lowest statistical indicators of the EME and EAM, as 


well as the maximum differences between the bivariate probabilities, have 
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been shown in italics. It is observed that the adopted Gumbel-Hougaard 


CF has fit indicators located between the smallest obtained with the Frank 


CF and the largest generated by the Associated Clayton CF. 


Table 6, Table 7 and Table 8 show a part of the bivariate probabilities 


of non-exceedance, empirical observed (w?) and theoretical calculated 


(w£) with the Frank, Gumbel-Hougaard and Associated Clayton CFs. The 


maximum positive and negative differences are also shaded. 


Table 6. Joint non-exceedance probabilities and their differences for a 


part of the floods of La Cuña station, Mexico; with Frank's CF. 


No. w; W; Differences | No. W; W; Differences 
1 0.6451 | 0.6477 -0.0026 30 0.9354 | 0.9344 0.0011 
5 0.4819 | 0.4872 0.0053 Els 0.0827 | 0.0783 0.0044 
10 0.3004 | 0.3139 0.0134 40 0.0464 | 0.0375 0.0089 
T2 0.6633 0.7194 -0.0561 45 0.1734 0.1704 0.0226 
20 0.6270 | 0.6096 0.0174 50 0.1190 | 0.1169 0.0021 
25 0.6451 | 0.6609 0.0157 55 0.8991 | 0.8410 0.0581 
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Table 7. Joint non—-exceedance probabilities and their differences for a 


part of the floods of the station La Cuña, Mexico; with the Gumbel- 


Hougaard CF. 


No. wi wi Differences | No. wi wi Differences 
1 0.6451 | 0.6512 0.0061 30 | 0.9354 | 0.9520 -0.0166 
5 [| 0.4819 | 0.4765 0.0053 35 | 0.0827 | 0.0777 0.0050 
12 | 0.6633 | 0.7309 0.0676 40 | 0.0464 | 0.0401 0.0063 
15 | 0.5000 | 0.4459 0.0541 45 | 0.1734 | 0.1624 0.0111 
20 | 0.6270 | 0.6116 0.0154 50 | 0.1190 0.1143 0.0048 
25 | 0.6451 | 0.6661 0.0210 55 | 0.8991 | 0.8557 0.0435 
Table 8. Joint non-exceedance probabilities and their differences for a 
part of the floods of La Cuña station, Mexico; with Associated Clayton 
CF. 
No. wi wi Differences | No. wi wi Differences 
1 0.6451 | 0.6540 -0.0088 30 | 0.9354 | 0.9548 0.0194 
5 |0.4819 | 0.4863 0.0044 35 | 0.0827 | 0.0727 0.0100 
12 | 0.6633 | 0.7421 -0.0788 40 | 0.0464 | 0.0286 0.0178 
15 | 0.5000 | 0.4523 0.0477 45 | 0.1734 | 0.1454 0.0280 
20 | 0.6270 0.6170 0.0100 50 | 0,1190 | 0.0944 0.0246 
25 | 0.6451 | 0.6798 0.0346 55 | 0.8991 | 0.8578 0.0414 
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On the other hand, Equation (34) defines D, = 0.1831 and since the 
absolute maximum differences of Table 6, Table 7 and Table 8 are 0.0581, 
0.0676 and 0.0788, the Kolmogorov-Smirnov test confirms the three 
contrasted CF. The correlation coefficients (rxy) between the 55 empirical 
probabilities and the theoretical ones, estimated with the Frank, Gumbel- 
Hougaard and Associated Clayton CFs, are the following: 0.9968, 0.9964 
and 0.9962; therefore, they correspond, in order of magnitude, with the 


maximum differences observed. 


The graphic contrast between both probabilities, to ratify the 
adoption of the Gumbel-Hougaard CF is shown in Figure 2 for the 


complete data in Table 7. 
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Figure 2. Graphical contrast of joint probabilities of the peak flow and 
volumes of the floods of the La Cuña station, Mexico; with the Gumbel- 
Hougaard CF. 
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Bivariate return period Ty, plots 


Considering that a reservoir is going to be built in the riverbed and near 
the site of the La Cuña gauging station, which can be small, medium or 
large, depending on the capacity assigned to it, then bivariate return 
periods are required of type AND of 500, 1 000, 5 000 and 10 000 years 
(Campos-Aranda, 2008). Such magnitudes are estimated with Equation 
(36). For which, volumes and peak flows are arbitrarily selected, to obtain 
their marginal non-exceedance probabilities (equations (54) and (55)) 
and joint (Equation (7)) with the Gumbel-Hougaard CF. Table 9 shows 


results to define the four graphs of Figure 1. 
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Table 9. Pairs of maximum flow and annual volume used to define the 
graphs of the joint return period type AND with the CF of Gumbel- 


Hougaard, in the floods of the hydrometric station La Cuña, Mexico. 


Ter Tor Tor Tor 
500 years 1 000 years 5 000 years 10 000 years 
v Q v Q v Q v Q 
Mm3 m3/s Mm3 m3/s Mm3 m3/s Mm3 m3/s 
0 3920 0 4844 0 7784 0 9489 


s00 | 3919 | 500 | 4843 | 1000 | 7782 | 1500 | 9489 
750 | 3918 | 1000 | 4841 | 1500 | 7782 | 2500 | 9485 
1000 | 3906 | 1250 | 4830 | 2000 | 7773 | 2600 | 9480 
1100 | 3892 | 1400 | 4813 | 2200 | 7763 | 2700 | 9478 
1250 | 3847 | 1500 | 4791 | 2400 | 7742 | 2800 | 9469 
1300 | 3818 | 1550 | 4776 | 2500 | 7727 | 2900 | 9459 
1400 | 3720 | 1600 | 4756 | 2600 | 7705 | 3000 | 9449 | 
1450 | 3630 | 1650 | 4729 | 2700 | 7678 | 3100 | 9436 
1500 | 3475 | 1700 | 4695 | 2800 | 7638 | 3200 | 9421 
1 1525 | 3342 | 1750 | 4649 | 2900 | 7585 | 3300 | 9400 | 
1540 | 3222 | 1800 | 4586 | 3000 | 7511 | 3400 | 9370 
1550 | 3108 | 1850 | 4495 | 3100 | 7405 | 3500 | 9335 
1560 | 2938 | 1875 | 4433 | 3200 | 7241 | 3600 | 9290 
1565 | 2809 | 1900 | 4355 | 3300 | 6963 | 3700 | 9232 
1570 | 2604 | 1925 | 4252 | 3325 | 6860 | 3800 | 9148 
1575 | 2039 | 1950 | 4104 | 3350 | 6731 | 3900 | 9049 


1576 0 1975 3862 3375 6566 4000 8898 
2000 3244 3400 6340 4050 8800 

2007 0 3425 5981 4100 8674 

3450 5183 4150 8515 

3459 0 4200 8306 


4250 7996 
4300 7482 
4325 6935 
4350 0 
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Contrast of bivariate return periods 


Table 10 shows the calculations made to verify Equation (37). It is 
observed and verified that in all cases the Txy is less than the univariate 


return periods (7) and on the contrary, the 7;,, is always greater. 


Table 10. Univariate and bivariate return periods estimated with the 
Gumbel-Hougaard CF in the floods of the hydrometric station La Cuña, 


Mexico. 
T=Tx=Ty  Qr vr CLFX(O,Fr(y)1 Txy Ts 

50 1835 | 663 0.9757031 41.2 63.3 
100 2335 | 873 0.9878718 82.5 127.1 
500 3920 | 1576 0.9975724 411.9 636.3 
1000 4844 | 2007 0.9987859 823.7 | 1272.5 
5000 7783 | 3459 0.9997571 4116.9 | 6364.6 
10000 9489 | 4350 0.9998783 8216.9 | 12700.4 


Selection of design events 


In Figure 1 or in Table 9, infinite pairs of Q and V can be selected, which 
satisfy the joint design return period and which are defined as a subgroup 
of critical pairs, as they are within the curved portion of each graph of the 


Txy, outside the asymptote lines (Volpi € Fiori, 2012). 
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The combinations of peak flow and volume that have the same design 
bivariate return period, establish floods or hydrographs that will produce 
different effects in the reservoir that is designed or revised; adopting for 
security, the one that generates the most critical or severe conditions. To 
form each design hydrograph, there are theoretical and empirical 
methods (Aldama, 2000; Aldama et a/., 2006; Campos-Aranda, 2008; 
Xiao, Guo, Liu, 8 Fang, 2008; Gómez et al., 2010; Gráler et a/., 2013). 


Contrast with results of the bivariate GVE 


Campos-Aranda (2022) processed the records in Table 1, based on the 
bivariate GVE distribution, adjusted by means of maximum likelihood; 
therefore, its marginal functions are of the GVE type and its fit parameters 
are involved in the best fit achieved. Table 11, in rows 1 and 2, shows the 


joint univariate predictions that were obtained. 


Table 11. Predictions obtained with the bivariate GVE and Kappa 
distributions as marginal of the floods reports from the hydrometric 


station La Cuña, Mexico. 


Univariate return periods, in years 
Row | Rec. PDF 


50 100 500 | 1000 5 000 10 000 
1 Q GVEb | 1550 | 1958 | 3254 | 4009 6418 7823 
2 V GVEb | 568 739 1317 1673 2877 3619 
a Q Kappa | 1835 | 2335 | 3920 4844 7784 9489 
4 V Kappa | 663 873 1576 2007 3459 4350 
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On the other hand, rows 3 and 4 of Table 11 indicate the predictions 
adopted in Tables 3 and 4 above. It can be observed, for the indicated 
return periods, that all the new pairs of predictions of Q and V are larger, 
this affects the definition of the critical pairs in Figure 1, which are now 
larger, compared to those obtained when applying the bivariate GVE 


distribution. 


Conclusions 


The fundamental advantage of engaging the Copula Functions (CF) in the 
analysis of bivariate increasing frequencies consists in the ability to easily 
construct the joint probability distribution, based on the same or different 
marginal univariate distributions, even from mixed populations; prior 
estimation of the dependence between the random variables: maximum 


flow (Q) and volume (V) of the annual floods. 


The practical approach to applying CF requires a careful selection of 
marginal distributions, and for this reason, the L-ratio diagram was used 
to search for the three best probability distributions and then contrast 
them, based on their fit errors and magnitude of its predictions and thus 
adopt the most representative ones for the available Q and V records. In 
this contrast, two highly versatile distributions were included, as they 


have four and five fit parameters, the Kappa and Wakeby models. 


This practical approach to CF application, uses single fit parameter 


Copulas (0), which is estimated based on Kendall's tau quotient or 
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Spearman's rho coefficient, which are calculated with the joint registration 
of Q and V. Such an approach first estimates 25“ or right tail dependency 
of the available record. Then a CF is searched that reproduces such value 
of 15%, CFs that do not have significant right tail dependency, such as 
those of Clayton, Frank, Plackett and Raftery, are also applied in order to 


compare and judge the quality of the fit of the previously adopted CF. 


The numerical application described, in the 55 annual data of Q and 
V of the floods registered in the hydrometric station La Cuña of the 
Hydrological Region No. 12-3, of Mexico; showed a reliable reproduction 
of the empirical and theoretical bivariate probabilities, through the 
Gumbel-Hougaard CF, with a linear correlation coefficient of 0.9964. On 
the other hand, in Figure 1, relative to the design joint return periods of 
the AND type, infinite pairs of critical Q and V can be defined, since they 


are in the curved region of each graph. 
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